{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c024bfa4-1a7a-4751-b5a1-827225a3478b",
   "metadata": {
    "id": "c024bfa4-1a7a-4751-b5a1-827225a3478b"
   },
   "source": [
    "<table style=\"width:100%\">\n",
    "<tr>\n",
    "<td style=\"vertical-align:middle; text-align:left;\">\n",
    "<font size=\"2\">\n",
    "Supplementary code for the <a href=\"http://mng.bz/orYv\">Build a Large Language Model From Scratch</a> book by <a href=\"https://sebastianraschka.com\">Sebastian Raschka</a><br>\n",
    "<br>Code repository: <a href=\"https://github.com/rasbt/LLMs-from-scratch\">https://github.com/rasbt/LLMs-from-scratch</a>\n",
    "</font>\n",
    "</td>\n",
    "<td style=\"vertical-align:middle; text-align:left;\">\n",
    "<a href=\"http://mng.bz/orYv\"><img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/cover-small.webp\" width=\"100px\"></a>\n",
    "</td>\n",
    "</tr>\n",
    "</table>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "58b8c870-fb72-490e-8916-d8129bd5d1ff",
   "metadata": {
    "id": "58b8c870-fb72-490e-8916-d8129bd5d1ff"
   },
   "source": [
    "# Appendix E: Parameter-efficient Finetuning with LoRA"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "5b7e01c2-1c84-4f2a-bb51-2e0b74abda90",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "5b7e01c2-1c84-4f2a-bb51-2e0b74abda90",
    "outputId": "316166b4-027a-4756-e9b4-fe88ae75dd4f"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "matplotlib version: 3.10.5\n",
      "numpy version: 2.0.2\n",
      "tiktoken version: 0.11.0\n",
      "torch version: 2.8.0\n",
      "tensorflow version: 2.20.0\n",
      "pandas version: 2.3.1\n"
     ]
    }
   ],
   "source": [
    "from importlib.metadata import version\n",
    "\n",
    "pkgs = [\"matplotlib\",\n",
    "        \"numpy\",\n",
    "        \"tiktoken\",\n",
    "        \"torch\",\n",
    "        \"tensorflow\", # For OpenAI's pretrained weights\n",
    "        \"pandas\"      # Dataset loading\n",
    "       ]\n",
    "for p in pkgs:\n",
    "    print(f\"{p} version: {version(p)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21532056-0ef4-4c98-82c7-e91f61c6485e",
   "metadata": {
    "id": "21532056-0ef4-4c98-82c7-e91f61c6485e"
   },
   "source": [
    "## E.1 Introduction to LoRA"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66edc999-3d91-4a1c-a157-9d056392e8d8",
   "metadata": {
    "id": "66edc999-3d91-4a1c-a157-9d056392e8d8"
   },
   "source": [
    "- No code in this section\n",
    "- Low-rank adaptation (LoRA) is a machine learning technique that adapts a pretrained model to better suit a specific, often smaller, dataset by training only a small number of additional low-rank parameters while the pretrained weights stay fixed\n",
    "- This approach is important because it allows for efficient finetuning of large models on task-specific data, significantly reducing the computational cost and time required for finetuning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5bb75b5d-d59c-4948-821a-1594a5883dc1",
   "metadata": {
    "id": "5bb75b5d-d59c-4948-821a-1594a5883dc1"
   },
   "source": [
    "- Suppose we have a large weight matrix $W$ for a given layer\n",
    "- During backpropagation, we learn a $\\Delta W$ matrix, which contains information on how much we want to update the original weights to minimize the loss function during training\n",
    "- In regular training and finetuning, the weight update is defined as follows:\n",
    "\n",
    "$$W_{\\text{updated}} = W + \\Delta W$$\n",
    "\n",
    "- The LoRA method proposed by [Hu et al.](https://arxiv.org/abs/2106.09685) offers a more efficient alternative to computing the full weight update $\\Delta W$ by learning a low-rank approximation of it, $\\Delta W \\approx AB$\n",
    "- In other words, in LoRA, we have the following, where $A$ and $B$ are two small weight matrices:\n",
    "\n",
    "$$W_{\\text{updated}} = W + AB$$\n",
    "\n",
    "- The figure below illustrates these formulas for full finetuning and LoRA side by side"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8a7419d-cae9-4525-bb44-1641f6ef4f3b",
   "metadata": {
    "id": "a8a7419d-cae9-4525-bb44-1641f6ef4f3b"
   },
   "source": [
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/appendix-e_compressed/lora-1.webp\" width=\"500px\">"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4edd43c9-8ec5-48e6-b3fc-5fb3c16037cc",
   "metadata": {
    "id": "4edd43c9-8ec5-48e6-b3fc-5fb3c16037cc"
   },
   "source": [
    "- If you paid close attention, the full finetuning and LoRA depictions in the figure above look slightly different from the formulas I have shown earlier\n",
    "- That's due to the distributive law of matrix multiplication: we don't have to add the weight update to the original weights but can keep the two separate\n",
    "- For instance, if $x$ is the input data, then we can write the following for regular finetuning:\n",
    "\n",
    "$$x (W+\\Delta W) = x W + x \\Delta W$$\n",
    "\n",
    "- Similarly, we can write the following for LoRA:\n",
    "\n",
    "$$x (W+A B) = x W + x A B$$\n",
    "\n",
    "- The fact that we can keep the LoRA weight matrices separate makes LoRA especially attractive\n",
    "- In practice, this means that we don't have to modify the weights of the pretrained model at all, as we can apply the LoRA matrices on the fly\n",
    "- After setting up the dataset and loading the model, we will implement LoRA in the code to make these concepts less abstract"
   ]
  },
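  {
   "cell_type": "markdown",
   "id": "e1-lora-param-count-sketch",
   "metadata": {
    "id": "e1-lora-param-count-sketch"
   },
   "source": [
    "- As a rough illustration of why the factorization saves parameters, consider a hypothetical layer with $d_{\\text{in}} = d_{\\text{out}} = 768$ and a rank of $r = 16$ (these numbers are assumptions chosen to match the GPT-2 small model and the LoRA settings used later in this notebook)\n",
    "- With $W \\in \\mathbb{R}^{d_{\\text{in}} \\times d_{\\text{out}}}$, the two LoRA matrices have the shapes $A \\in \\mathbb{R}^{d_{\\text{in}} \\times r}$ and $B \\in \\mathbb{R}^{r \\times d_{\\text{out}}}$, so that $AB$ has the same shape as $\\Delta W$\n",
    "\n",
    "$$\\underbrace{d_{\\text{in}} \\cdot d_{\\text{out}}}_{\\Delta W} = 768 \\cdot 768 = 589{,}824 \\qquad \\text{vs.} \\qquad \\underbrace{r \\cdot (d_{\\text{in}} + d_{\\text{out}})}_{A \\text{ and } B} = 16 \\cdot 1{,}536 = 24{,}576$$\n",
    "\n",
    "- In this sketch, the low-rank pair has 24 times fewer trainable values than a full $\\Delta W$ for the same layer"
   ]
  },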
  {
   "cell_type": "markdown",
   "id": "8c7017a2-32aa-4002-a2f3-12aac293ccdf",
   "metadata": {
    "id": "8c7017a2-32aa-4002-a2f3-12aac293ccdf"
   },
   "source": [
    "## E.2 Preparing the dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "669c64df-4431-4d27-834d-2bb38a01fc02",
   "metadata": {
    "id": "669c64df-4431-4d27-834d-2bb38a01fc02"
   },
   "source": [
    "- This section repeats the code from chapter 6 to load and prepare the dataset\n",
    "- Instead of repeating this code, one could open and run the chapter 6 notebook and then insert the LoRA code from section E.4 there\n",
    "- (The LoRA code was originally the last section of chapter 6 but was moved to the appendix due to the length of chapter 6)\n",
    "- In a similar fashion, we could also apply LoRA to the models in chapter 7 for instruction finetuning"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "def7c09b-af9c-4216-90ce-5e67aed1065c",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "def7c09b-af9c-4216-90ce-5e67aed1065c",
    "outputId": "a67a7afe-b401-4463-c731-87025d20f72d"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sms_spam_collection/SMSSpamCollection.tsv already exists. Skipping download and extraction.\n"
     ]
    }
   ],
   "source": [
    "# import urllib\n",
    "import requests\n",
    "from pathlib import Path\n",
    "import pandas as pd\n",
    "from previous_chapters import (\n",
    "    download_and_unzip_spam_data,\n",
    "    create_balanced_dataset,\n",
    "    random_split\n",
    ")\n",
    "# If the `previous_chapters.py` file is not available locally,\n",
    "# you can import it from the `llms-from-scratch` PyPI package.\n",
    "# For details, see: https://github.com/rasbt/LLMs-from-scratch/tree/main/pkg\n",
    "# E.g.,\n",
    "# from llms_from_scratch.ch06 import (\n",
    "#     download_and_unzip_spam_data,\n",
    "#     create_balanced_dataset,\n",
    "#     random_split\n",
    "# )\n",
    "\n",
    "\n",
    "\n",
    "url = \"https://archive.ics.uci.edu/static/public/228/sms+spam+collection.zip\"\n",
    "zip_path = \"sms_spam_collection.zip\"\n",
    "extracted_path = \"sms_spam_collection\"\n",
    "data_file_path = Path(extracted_path) / \"SMSSpamCollection.tsv\"\n",
    "\n",
    "\n",
    "try:\n",
    "    download_and_unzip_spam_data(url, zip_path, extracted_path, data_file_path)\n",
    "except (requests.exceptions.RequestException, TimeoutError) as e:\n",
    "    print(f\"Primary URL failed: {e}. Trying backup URL...\")\n",
    "    url = \"https://f001.backblazeb2.com/file/LLMs-from-scratch/sms%2Bspam%2Bcollection.zip\"\n",
    "    download_and_unzip_spam_data(url, zip_path, extracted_path, data_file_path)\n",
    "\n",
    "# The book originally used\n",
    "# except (urllib.error.HTTPError, urllib.error.URLError, TimeoutError) as e:\n",
    "# in the code above.\n",
    "# However, some VPN users reported issues with `urllib`, so the code was updated\n",
    "# to use `requests` instead\n",
    "\n",
    "df = pd.read_csv(data_file_path, sep=\"\\t\", header=None, names=[\"Label\", \"Text\"])\n",
    "balanced_df = create_balanced_dataset(df)\n",
    "balanced_df[\"Label\"] = balanced_df[\"Label\"].map({\"ham\": 0, \"spam\": 1})\n",
    "\n",
    "train_df, validation_df, test_df = random_split(balanced_df, 0.7, 0.1)\n",
    "train_df.to_csv(\"train.csv\", index=None)\n",
    "validation_df.to_csv(\"validation.csv\", index=None)\n",
    "test_df.to_csv(\"test.csv\", index=None)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "74c3c463-8763-4cc0-9320-41c7eaad8ab7",
   "metadata": {
    "id": "74c3c463-8763-4cc0-9320-41c7eaad8ab7"
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "import tiktoken\n",
    "from previous_chapters import SpamDataset\n",
    "\n",
    "\n",
    "tokenizer = tiktoken.get_encoding(\"gpt2\")\n",
    "train_dataset = SpamDataset(\"train.csv\", max_length=None, tokenizer=tokenizer)\n",
    "val_dataset = SpamDataset(\"validation.csv\", max_length=train_dataset.max_length, tokenizer=tokenizer)\n",
    "test_dataset = SpamDataset(\"test.csv\", max_length=train_dataset.max_length, tokenizer=tokenizer)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "8681adc0-6f02-4e75-b01a-a6ab75d05542",
   "metadata": {
    "id": "8681adc0-6f02-4e75-b01a-a6ab75d05542"
   },
   "outputs": [],
   "source": [
    "from torch.utils.data import DataLoader\n",
    "\n",
    "num_workers = 0\n",
    "batch_size = 8\n",
    "\n",
    "torch.manual_seed(123)\n",
    "\n",
    "train_loader = DataLoader(\n",
    "    dataset=train_dataset,\n",
    "    batch_size=batch_size,\n",
    "    shuffle=True,\n",
    "    num_workers=num_workers,\n",
    "    drop_last=True,\n",
    ")\n",
    "\n",
    "val_loader = DataLoader(\n",
    "    dataset=val_dataset,\n",
    "    batch_size=batch_size,\n",
    "    num_workers=num_workers,\n",
    "    drop_last=False,\n",
    ")\n",
    "\n",
    "test_loader = DataLoader(\n",
    "    dataset=test_dataset,\n",
    "    batch_size=batch_size,\n",
    "    num_workers=num_workers,\n",
    "    drop_last=False,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab7335db-e0bb-4e27-80c5-eea11e593a57",
   "metadata": {
    "id": "ab7335db-e0bb-4e27-80c5-eea11e593a57"
   },
   "source": [
    "- As a verification step, we iterate through the training data loader and check that the batches contain 8 training examples each, where each training example consists of 120 tokens"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "4dee6882-4c3a-4964-af15-fa31f86ad047",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "4dee6882-4c3a-4964-af15-fa31f86ad047",
    "outputId": "2ae34de1-dd01-4f99-d2c8-ba4dca400754"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train loader:\n",
      "Input batch dimensions: torch.Size([8, 120])\n",
      "Label batch dimensions torch.Size([8])\n"
     ]
    }
   ],
   "source": [
    "print(\"Train loader:\")\n",
    "for input_batch, target_batch in train_loader:\n",
    "    pass\n",
    "\n",
    "print(\"Input batch dimensions:\", input_batch.shape)\n",
    "print(\"Label batch dimensions\", target_batch.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5cdd7947-7039-49bf-8a5e-c0a2f4281ca1",
   "metadata": {
    "id": "5cdd7947-7039-49bf-8a5e-c0a2f4281ca1"
   },
   "source": [
    "- Lastly, let's print the total number of batches in each dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "IZfw-TYD2zTj",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "IZfw-TYD2zTj",
    "outputId": "4d19ed61-cf7a-4ec4-b822-c847dd1c5d77"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "130 training batches\n",
      "19 validation batches\n",
      "38 test batches\n"
     ]
    }
   ],
   "source": [
    "print(f\"{len(train_loader)} training batches\")\n",
    "print(f\"{len(val_loader)} validation batches\")\n",
    "print(f\"{len(test_loader)} test batches\")"
   ]
  },
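  {
   "cell_type": "markdown",
   "id": "e2-batch-count-sanity-check",
   "metadata": {
    "id": "e2-batch-count-sanity-check"
   },
   "source": [
    "- These batch counts can be sanity-checked by hand, assuming the balanced dataset from chapter 6 contains 1,494 messages (747 per class) and that `random_split` truncates the split sizes to integers; both are assumptions about code that is not shown in this notebook\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\text{training:} \\quad & \\lfloor 0.7 \\cdot 1494 \\rfloor = 1045 \\text{ examples} \\;\\Rightarrow\\; \\lfloor 1045 / 8 \\rfloor = 130 \\text{ batches (last partial batch dropped)} \\\\\n",
    "\\text{validation:} \\quad & \\lfloor 0.1 \\cdot 1494 \\rfloor = 149 \\text{ examples} \\;\\Rightarrow\\; \\lceil 149 / 8 \\rceil = 19 \\text{ batches} \\\\\n",
    "\\text{test:} \\quad & 1494 - 1045 - 149 = 300 \\text{ examples} \\;\\Rightarrow\\; \\lceil 300 / 8 \\rceil = 38 \\text{ batches}\n",
    "\\end{aligned}$$"
   ]
  },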
  {
   "cell_type": "markdown",
   "id": "dec9aa4a-ffd2-4d9f-a835-cce1059fe604",
   "metadata": {
    "id": "dec9aa4a-ffd2-4d9f-a835-cce1059fe604"
   },
   "source": [
    "## E.3 Initializing the model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f36ebdaf-810e-46a2-9ad9-e017a04051b1",
   "metadata": {
    "id": "f36ebdaf-810e-46a2-9ad9-e017a04051b1"
   },
   "source": [
    "- This section repeats the code from chapter 6 to load and prepare the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "02b3a506-3879-4258-82b5-93a5b6bafa74",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "02b3a506-3879-4258-82b5-93a5b6bafa74",
    "outputId": "b8c9b125-bb52-45d3-8071-fa5054dbf5a9"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "File already exists and is up-to-date: gpt2/124M/checkpoint\n",
      "File already exists and is up-to-date: gpt2/124M/encoder.json\n",
      "File already exists and is up-to-date: gpt2/124M/hparams.json\n",
      "File already exists and is up-to-date: gpt2/124M/model.ckpt.data-00000-of-00001\n",
      "File already exists and is up-to-date: gpt2/124M/model.ckpt.index\n",
      "File already exists and is up-to-date: gpt2/124M/model.ckpt.meta\n",
      "File already exists and is up-to-date: gpt2/124M/vocab.bpe\n"
     ]
    }
   ],
   "source": [
    "from gpt_download import download_and_load_gpt2\n",
    "from previous_chapters import GPTModel, load_weights_into_gpt\n",
    "# Alternatively:\n",
    "# from llms_from_scratch.ch04 import GPTModel\n",
    "# from llms_from_scratch.ch05 import load_weights_into_gpt\n",
    "\n",
    "\n",
    "\n",
    "CHOOSE_MODEL = \"gpt2-small (124M)\"\n",
    "INPUT_PROMPT = \"Every effort moves\"\n",
    "\n",
    "BASE_CONFIG = {\n",
    "    \"vocab_size\": 50257,     # Vocabulary size\n",
    "    \"context_length\": 1024,  # Context length\n",
    "    \"drop_rate\": 0.0,        # Dropout rate\n",
    "    \"qkv_bias\": True         # Query-key-value bias\n",
    "}\n",
    "\n",
    "model_configs = {\n",
    "    \"gpt2-small (124M)\": {\"emb_dim\": 768, \"n_layers\": 12, \"n_heads\": 12},\n",
    "    \"gpt2-medium (355M)\": {\"emb_dim\": 1024, \"n_layers\": 24, \"n_heads\": 16},\n",
    "    \"gpt2-large (774M)\": {\"emb_dim\": 1280, \"n_layers\": 36, \"n_heads\": 20},\n",
    "    \"gpt2-xl (1558M)\": {\"emb_dim\": 1600, \"n_layers\": 48, \"n_heads\": 25},\n",
    "}\n",
    "\n",
    "BASE_CONFIG.update(model_configs[CHOOSE_MODEL])\n",
    "\n",
    "model_size = CHOOSE_MODEL.split(\" \")[-1].lstrip(\"(\").rstrip(\")\")\n",
    "settings, params = download_and_load_gpt2(model_size=model_size, models_dir=\"gpt2\")\n",
    "\n",
    "model = GPTModel(BASE_CONFIG)\n",
    "load_weights_into_gpt(model, params)\n",
    "model.eval();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "252614cd-7ce6-4908-83e6-3761f519904e",
   "metadata": {
    "id": "252614cd-7ce6-4908-83e6-3761f519904e"
   },
   "source": [
    "- To ensure that the model was loaded correctly, let's double-check that it generates coherent text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "8b6ce20c-0700-4783-8be0-4cf17c200a7f",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "8b6ce20c-0700-4783-8be0-4cf17c200a7f",
    "outputId": "28ccbca5-8de9-41a0-c093-da00fcbaa91c"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Every effort moves you forward.\n",
      "\n",
      "The first step is to understand the importance of your work\n"
     ]
    }
   ],
   "source": [
    "from previous_chapters import (\n",
    "    generate_text_simple,\n",
    "    text_to_token_ids,\n",
    "    token_ids_to_text\n",
    ")\n",
    "\n",
    "\n",
    "text_1 = \"Every effort moves you\"\n",
    "\n",
    "token_ids = generate_text_simple(\n",
    "    model=model,\n",
    "    idx=text_to_token_ids(text_1, tokenizer),\n",
    "    max_new_tokens=15,\n",
    "    context_size=BASE_CONFIG[\"context_length\"]\n",
    ")\n",
    "\n",
    "print(token_ids_to_text(token_ids, tokenizer))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8174b31b-1ab5-4115-b01c-245369da5af3",
   "metadata": {
    "id": "8174b31b-1ab5-4115-b01c-245369da5af3"
   },
   "source": [
    "- Then, we prepare the model for classification finetuning similar to chapter 6, where we replace the output layer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "e255ce91-d73a-4854-90a4-95804928eb16",
   "metadata": {
    "id": "e255ce91-d73a-4854-90a4-95804928eb16"
   },
   "outputs": [],
   "source": [
    "torch.manual_seed(123)\n",
    "\n",
    "num_classes = 2\n",
    "model.out_head = torch.nn.Linear(in_features=BASE_CONFIG[\"emb_dim\"], out_features=num_classes)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "02e6f057-1383-4ece-8444-0a88e71ac75d",
   "metadata": {
    "id": "02e6f057-1383-4ece-8444-0a88e71ac75d"
   },
   "outputs": [],
   "source": [
    "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
    "\n",
    "# Note:\n",
    "# Uncommenting the following lines will allow the code to run on Apple Silicon chips, if applicable,\n",
    "# which is approximately 1.2x faster than on an Apple CPU (as measured on an M3 MacBook Air).\n",
    "# However, the resulting loss values may be slightly different.\n",
    "\n",
    "#if torch.cuda.is_available():\n",
    "#    device = torch.device(\"cuda\")\n",
    "#elif torch.backends.mps.is_available():\n",
    "#    device = torch.device(\"mps\")\n",
    "#else:\n",
    "#    device = torch.device(\"cpu\")\n",
    "#\n",
    "# print(f\"Using {device} device.\")\n",
    "\n",
    "model.to(device);  # no assignment model = model.to(device) necessary for nn.Module classes"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8e951cd6-5e42-44d2-b21f-895cb61004fe",
   "metadata": {
    "id": "8e951cd6-5e42-44d2-b21f-895cb61004fe"
   },
   "source": [
    "- Lastly, let's calculate the initial classification accuracy of the non-finetuned model (we expect this to be around 50%, which means that the model cannot yet reliably distinguish between spam and non-spam messages)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "fc7dd72c-73a2-4881-ade0-0a9605f1ab8c",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "fc7dd72c-73a2-4881-ade0-0a9605f1ab8c",
    "outputId": "74848515-5a49-4125-fecb-9f4bac23f812"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training accuracy: 46.25%\n",
      "Validation accuracy: 45.00%\n",
      "Test accuracy: 48.75%\n"
     ]
    }
   ],
   "source": [
    "from previous_chapters import calc_accuracy_loader\n",
    "# Alternatively:\n",
    "# from llms_from_scratch.ch06 import calc_accuracy_loader\n",
    "\n",
    "\n",
    "\n",
    "torch.manual_seed(123)\n",
    "train_accuracy = calc_accuracy_loader(train_loader, model, device, num_batches=10)\n",
    "val_accuracy = calc_accuracy_loader(val_loader, model, device, num_batches=10)\n",
    "test_accuracy = calc_accuracy_loader(test_loader, model, device, num_batches=10)\n",
    "\n",
    "print(f\"Training accuracy: {train_accuracy*100:.2f}%\")\n",
    "print(f\"Validation accuracy: {val_accuracy*100:.2f}%\")\n",
    "print(f\"Test accuracy: {test_accuracy*100:.2f}%\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "398a1ec9-e2a1-43d6-bf9f-12ee54b46a7b",
   "metadata": {
    "id": "398a1ec9-e2a1-43d6-bf9f-12ee54b46a7b"
   },
   "source": [
    "## E.4 Parameter-efficient finetuning with LoRA"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "652a4a82-61ef-4d0a-9858-8988e844f12c",
   "metadata": {
    "id": "652a4a82-61ef-4d0a-9858-8988e844f12c"
   },
   "source": [
    "- We begin by implementing a `LoRALayer` class that creates the matrices $A$ and $B$ and stores the `alpha` scaling hyperparameter and the `rank` ($r$) hyperparameter\n",
    "- This layer can accept an input and compute the corresponding output, as illustrated in the figure below\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/appendix-e_compressed/lora-2.webp\" width=\"200px\">\n",
    "\n",
    "- In code, the LoRA layer depicted in the figure above looks as follows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "2ds9ywjMwvIW",
   "metadata": {
    "id": "2ds9ywjMwvIW"
   },
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "class LoRALayer(torch.nn.Module):\n",
    "    def __init__(self, in_dim, out_dim, rank, alpha):\n",
    "        super().__init__()\n",
    "        self.A = torch.nn.Parameter(torch.empty(in_dim, rank))\n",
    "        torch.nn.init.kaiming_uniform_(self.A, a=math.sqrt(5))  # similar to standard weight initialization\n",
    "        self.B = torch.nn.Parameter(torch.zeros(rank, out_dim))\n",
    "        self.alpha = alpha\n",
    "        self.rank = rank\n",
    "\n",
    "    def forward(self, x):\n",
    "        # Note: The original chapter didn't include the scaling by self.rank\n",
    "        # This scaling is not necessary, but it's more canonical and convenient\n",
    "        # as this lets us compare runs across different ranks without retuning learning rates\n",
    "        x = (self.alpha / self.rank) * (x @ self.A @ self.B)\n",
    "        return x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad21faa8-0614-4257-93cd-68952193e14a",
   "metadata": {
    "id": "ad21faa8-0614-4257-93cd-68952193e14a"
   },
   "source": [
    "- In the code above, `rank` is a hyperparameter that controls the inner dimension of the matrices $A$ and $B$\n",
    "- In other words, this parameter controls the number of additional parameters introduced by LoRA and is a key factor in determining the balance between model adaptability and parameter efficiency\n",
    "- The second hyperparameter, `alpha`, is a scaling hyperparameter applied to the output of the low-rank adaptation\n",
    "- It essentially controls the extent to which the adapted layer's output is allowed to influence the original output of the layer being adapted\n",
    "- This can be seen as a way to regulate the impact of the low-rank adaptation on the layer's output\n",
    "- So far, the `LoRALayer` class we implemented above allows us to transform the layer inputs $x$\n",
    "- However, in LoRA, we are usually interested in replacing existing `Linear` layers so that the weight update is applied to the existing pretrained weights, as shown in the figure below\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/appendix-e_compressed/lora-3.webp\" width=\"200px\">"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e6d5da0-dfce-4808-b89b-29ff333f563f",
   "metadata": {
    "id": "3e6d5da0-dfce-4808-b89b-29ff333f563f"
   },
   "source": [
    "- To incorporate the original `Linear` layer weights as shown in the figure above, we implement a `LinearWithLoRA` layer below that uses the previously implemented `LoRALayer` and can be used to replace existing `Linear` layers in a neural network, for example, in the self-attention or feed-forward modules of an LLM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "127d3a64-8359-4b21-b056-78d58cc75fe8",
   "metadata": {
    "id": "127d3a64-8359-4b21-b056-78d58cc75fe8"
   },
   "outputs": [],
   "source": [
    "class LinearWithLoRA(torch.nn.Module):\n",
    "    def __init__(self, linear, rank, alpha):\n",
    "        super().__init__()\n",
    "        self.linear = linear\n",
    "        self.lora = LoRALayer(\n",
    "            linear.in_features, linear.out_features, rank, alpha\n",
    "        )\n",
    "\n",
    "    def forward(self, x):\n",
    "        return self.linear(x) + self.lora(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e1145a90-35ff-462c-820b-15483fa5b051",
   "metadata": {
    "id": "e1145a90-35ff-462c-820b-15483fa5b051"
   },
   "source": [
    "- Note that since we initialize the weight matrix $B$ (`self.B` in `LoRALayer`) with zero values in the LoRA layer, the matrix multiplication between $A$ and $B$ results in a matrix consisting of 0's and doesn't affect the original weights (since adding 0 to the original weights does not modify them)"
   ]
  },
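  {
   "cell_type": "markdown",
   "id": "e4-lora-forward-zero-init-sketch",
   "metadata": {
    "id": "e4-lora-forward-zero-init-sketch"
   },
   "source": [
    "- Written out for the two classes above, and assuming we denote the output of the frozen pretrained layer by $\\text{Linear}(x)$, the forward pass of `LinearWithLoRA` is\n",
    "\n",
    "$$\\text{LinearWithLoRA}(x) = \\text{Linear}(x) + \\frac{\\alpha}{r} \\, x A B$$\n",
    "\n",
    "- At initialization, $B = 0$, so the second term vanishes regardless of how $A$ is initialized:\n",
    "\n",
    "$$\\text{Linear}(x) + \\frac{\\alpha}{r} \\, x A \\cdot 0 = \\text{Linear}(x)$$\n",
    "\n",
    "- The factor $\\alpha / r$ only rescales the update; for example, $\\alpha = 16$ and $r = 16$ give a factor of 1, whereas $\\alpha = 32$ and $r = 16$ would give a factor of 2 (the second pair is a hypothetical setting for illustration)"
   ]
  },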
  {
   "cell_type": "markdown",
   "id": "e98a6d36-7bc9-434c-a7f1-533f26aff06d",
   "metadata": {
    "id": "e98a6d36-7bc9-434c-a7f1-533f26aff06d"
   },
   "source": [
    "- To try LoRA on the GPT model we defined earlier, we define a `replace_linear_with_lora` function to replace all `Linear` layers in the model with the new `LinearWithLoRA` layers\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/appendix-e_compressed/lora-4.webp\" width=\"400px\">"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "WlQZ8ygqzN_g",
   "metadata": {
    "id": "WlQZ8ygqzN_g"
   },
   "outputs": [],
   "source": [
    "def replace_linear_with_lora(model, rank, alpha):\n",
    "    for name, module in model.named_children():\n",
    "        if isinstance(module, torch.nn.Linear):\n",
    "            # Replace the Linear layer with LinearWithLoRA\n",
    "            setattr(model, name, LinearWithLoRA(module, rank, alpha))\n",
    "        else:\n",
    "            # Recursively apply the same function to child modules\n",
    "            replace_linear_with_lora(module, rank, alpha)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c172164-cdde-4489-b7d7-aaed9cc2f5f2",
   "metadata": {
    "id": "8c172164-cdde-4489-b7d7-aaed9cc2f5f2"
   },
   "source": [
    "- We then freeze the original model parameters and use the `replace_linear_with_lora` function to replace the `Linear` layers in the LLM with `LinearWithLoRA` layers, as shown in the code below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "dbe15350-4da9-4829-9d23-98bbd3d0b1a1",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "dbe15350-4da9-4829-9d23-98bbd3d0b1a1",
    "outputId": "fd4c208f-854a-4701-d9d3-9d73af733364"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total trainable parameters before: 124,441,346\n",
      "Total trainable parameters after: 0\n"
     ]
    }
   ],
   "source": [
    "total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
    "print(f\"Total trainable parameters before: {total_params:,}\")\n",
    "\n",
    "for param in model.parameters():\n",
    "    param.requires_grad = False\n",
    "\n",
    "total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
    "print(f\"Total trainable parameters after: {total_params:,}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "mLk_fPq0yz_u",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "mLk_fPq0yz_u",
    "outputId": "0a93b8fc-05d7-4ace-ee47-e2fc6bdd7d75"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Total trainable LoRA parameters: 2,666,528\n"
     ]
    }
   ],
   "source": [
    "replace_linear_with_lora(model, rank=16, alpha=16)\n",
    "\n",
    "total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
    "print(f\"Total trainable LoRA parameters: {total_params:,}\")"
   ]
  },
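  {
   "cell_type": "markdown",
   "id": "e4-lora-trainable-param-tally",
   "metadata": {
    "id": "e4-lora-trainable-param-tally"
   },
   "source": [
    "- The count printed above can be checked by hand: each `LoRALayer` adds $r \\cdot (d_{\\text{in}} + d_{\\text{out}})$ trainable values, with $r = 16$ here\n",
    "- The following tally is a sketch that assumes the GPT-2 small layout used in this notebook: 12 transformer blocks with four $768 \\to 768$ attention projections and two feed-forward layers ($768 \\to 3072$ and $3072 \\to 768$) each, plus the $768 \\to 2$ classification head\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\text{attention, per block:} \\quad & 4 \\cdot 16 \\cdot (768 + 768) = 98{,}304 \\\\\n",
    "\\text{feed forward, per block:} \\quad & 16 \\cdot (768 + 3072) + 16 \\cdot (3072 + 768) = 122{,}880 \\\\\n",
    "\\text{classification head:} \\quad & 16 \\cdot (768 + 2) = 12{,}320 \\\\\n",
    "\\text{total:} \\quad & 12 \\cdot (98{,}304 + 122{,}880) + 12{,}320 = 2{,}666{,}528\n",
    "\\end{aligned}$$"
   ]
  },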
  {
   "cell_type": "markdown",
   "id": "b8b6819e-ef7a-4f0d-841a-1b467496bef9",
   "metadata": {
    "id": "b8b6819e-ef7a-4f0d-841a-1b467496bef9"
   },
   "source": [
    "- As we can see, LoRA reduced the number of trainable parameters by a factor of about 47 (from 124,441,346 to 2,666,528)\n",
    "- Let's now double-check whether the layers have been modified as intended by printing the model architecture"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "1711be61-bb2c-466f-9b5b-24f4aa5ccd9c",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "1711be61-bb2c-466f-9b5b-24f4aa5ccd9c",
    "outputId": "acff8eca-3775-45a2-b62d-032a986ef037"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GPTModel(\n",
      "  (tok_emb): Embedding(50257, 768)\n",
      "  (pos_emb): Embedding(1024, 768)\n",
      "  (drop_emb): Dropout(p=0.0, inplace=False)\n",
      "  (trf_blocks): Sequential(\n",
      "    (0): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (1): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (2): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (3): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (4): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (5): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (6): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (7): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (8): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (9): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (10): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "    (11): TransformerBlock(\n",
      "      (att): MultiHeadAttention(\n",
      "        (W_query): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_key): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (W_value): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (out_proj): LinearWithLoRA(\n",
      "          (linear): Linear(in_features=768, out_features=768, bias=True)\n",
      "          (lora): LoRALayer()\n",
      "        )\n",
      "        (dropout): Dropout(p=0.0, inplace=False)\n",
      "      )\n",
      "      (ff): FeedForward(\n",
      "        (layers): Sequential(\n",
      "          (0): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=768, out_features=3072, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "          (1): GELU()\n",
      "          (2): LinearWithLoRA(\n",
      "            (linear): Linear(in_features=3072, out_features=768, bias=True)\n",
      "            (lora): LoRALayer()\n",
      "          )\n",
      "        )\n",
      "      )\n",
      "      (norm1): LayerNorm()\n",
      "      (norm2): LayerNorm()\n",
      "      (drop_resid): Dropout(p=0.0, inplace=False)\n",
      "    )\n",
      "  )\n",
      "  (final_norm): LayerNorm()\n",
      "  (out_head): LinearWithLoRA(\n",
      "    (linear): Linear(in_features=768, out_features=2, bias=True)\n",
      "    (lora): LoRALayer()\n",
      "  )\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
    "model.to(device)\n",
    "\n",
    "print(model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4bbc9d7-65ec-4675-bab8-2e56eb0cfb55",
   "metadata": {
    "id": "c4bbc9d7-65ec-4675-bab8-2e56eb0cfb55"
   },
   "source": [
    "- Based on the model architecture above, we can see that the model now contains our new `LinearWithLoRA` layers\n",
    "- Also, since we initialized matrix $B$ with zeros, we expect the initial model performance to be unchanged compared to before"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "DAlrb_I00VEU",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "DAlrb_I00VEU",
    "outputId": "3da44ac4-230b-4358-d996-30b63f0d962a"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training accuracy: 46.25%\n",
      "Validation accuracy: 45.00%\n",
      "Test accuracy: 48.75%\n"
     ]
    }
   ],
   "source": [
    "torch.manual_seed(123)\n",
    "train_accuracy = calc_accuracy_loader(train_loader, model, device, num_batches=10)\n",
    "val_accuracy = calc_accuracy_loader(val_loader, model, device, num_batches=10)\n",
    "test_accuracy = calc_accuracy_loader(test_loader, model, device, num_batches=10)\n",
    "\n",
    "print(f\"Training accuracy: {train_accuracy*100:.2f}%\")\n",
    "print(f\"Validation accuracy: {val_accuracy*100:.2f}%\")\n",
    "print(f\"Test accuracy: {test_accuracy*100:.2f}%\")"
   ]
  },
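  {
   "cell_type": "markdown",
   "id": "lora-zero-init-sketch",
   "metadata": {},
   "source": [
    "- The expectation that the initial accuracies are unchanged follows from the zero initialization of $B$; as a brief sketch, assuming `LinearWithLoRA` adds the scaled low-rank product to the output of the frozen linear layer (here $W$ and $b$ denote the pretrained weight and bias, and $\\alpha$ the LoRA scaling factor; this notation is introduced only for illustration):\n",
    "\n",
    "$$xW^\\top + b + \\alpha\\,(xAB) \\;=\\; xW^\\top + b \\quad \\text{when } B = 0$$\n",
    "\n",
    "- Only once training updates $B$ away from zero does the LoRA branch start to change the model's predictions"
   ]
  },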
  {
   "cell_type": "markdown",
   "id": "13735b3e-f0c3-4dba-ae3d-4141b2878101",
   "metadata": {
    "id": "13735b3e-f0c3-4dba-ae3d-4141b2878101"
   },
   "source": [
    "- Let's now get to the interesting part and finetune the model by reusing the training function from chapter 6\n",
    "- The training takes about 15 minutes on an M3 MacBook Air laptop and less than half a minute on a V100 or A100 GPU"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "wCParRvr0eff",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "wCParRvr0eff",
    "outputId": "ce910a9c-ee89-48bb-bfa6-49c6aee1e450"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Ep 1 (Step 000000): Train loss 3.820, Val loss 3.462\n",
      "Ep 1 (Step 000050): Train loss 0.346, Val loss 0.325\n",
      "Ep 1 (Step 000100): Train loss 0.063, Val loss 0.144\n",
      "Training accuracy: 100.00% | Validation accuracy: 92.50%\n",
      "Ep 2 (Step 000150): Train loss 0.054, Val loss 0.045\n",
      "Ep 2 (Step 000200): Train loss 0.058, Val loss 0.122\n",
      "Ep 2 (Step 000250): Train loss 0.041, Val loss 0.199\n",
      "Training accuracy: 100.00% | Validation accuracy: 95.00%\n",
      "Ep 3 (Step 000300): Train loss 0.020, Val loss 0.153\n",
      "Ep 3 (Step 000350): Train loss 0.018, Val loss 0.202\n",
      "Training accuracy: 100.00% | Validation accuracy: 95.00%\n",
      "Ep 4 (Step 000400): Train loss 0.014, Val loss 0.134\n",
      "Ep 4 (Step 000450): Train loss 0.000, Val loss 0.145\n",
      "Ep 4 (Step 000500): Train loss 0.002, Val loss 0.205\n",
      "Training accuracy: 97.50% | Validation accuracy: 90.00%\n",
      "Ep 5 (Step 000550): Train loss 0.016, Val loss 0.153\n",
      "Ep 5 (Step 000600): Train loss 0.038, Val loss 0.159\n",
      "Training accuracy: 100.00% | Validation accuracy: 95.00%\n",
      "Training completed in 6.21 minutes.\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "from previous_chapters import train_classifier_simple\n",
    "# Alternatively:\n",
    "# from llms_from_scratch.ch06 import train_classifier_simple\n",
    "\n",
    "\n",
    "start_time = time.time()\n",
    "\n",
    "torch.manual_seed(123)\n",
    "\n",
    "optimizer = torch.optim.AdamW(model.parameters(), lr=8e-4, weight_decay=0.1)\n",
    "\n",
    "num_epochs = 5\n",
    "train_losses, val_losses, train_accs, val_accs, examples_seen = train_classifier_simple(\n",
    "    model, train_loader, val_loader, optimizer, device,\n",
    "    num_epochs=num_epochs, eval_freq=50, eval_iter=5,\n",
    ")\n",
    "\n",
    "end_time = time.time()\n",
    "execution_time_minutes = (end_time - start_time) / 60\n",
    "print(f\"Training completed in {execution_time_minutes:.2f} minutes.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d0c89e82-3aa8-44c6-b046-0b16200b8e6c",
   "metadata": {
    "id": "d0c89e82-3aa8-44c6-b046-0b16200b8e6c"
   },
   "source": [
    "- Finally, let's evaluate the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "bawWGijA0iF3",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 308
    },
    "id": "bawWGijA0iF3",
    "outputId": "af70782a-d605-4376-fa6c-d33b38979cfa"
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAEiCAYAAAA21pHjAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjUsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvWftoOwAAAAlwSFlzAAAPYQAAD2EBqD+naQAARNxJREFUeJzt3Qd8FGX6B/DfbjohhRYgQCjSW0CaNKULIgKn4iEqcp78RUA9xMKpFD0P1FNRwX7KeaKAIJwFUDqCIL1D6CQBQoBAGuk7/8/z7s5mN4SQQJLZ8vv6GWdndnb33WEzz7zdpGmaBiIiInJJZqMTQERERNfGQE1EROTCGKiJiIhcGAM1ERGRC2OgJiIicmEM1ERERC6MgZqIiMiFMVATERG5MAZqIiIiF8ZATUROevTogWeeecboZBCRDQM1USl79NFHYTKZrlr69+9vdNKIyA35Gp0AIk8kQfnLL7902hcQEGBYeojIfTFHTVQGJCjXqFHDaalUqZJ6bu3atfD398dvv/1mP/7NN99EREQEzp07p7aXL1+Obt26ITw8HFWqVMHdd9+NY8eO2Y8/efKkyqUvWLAA3bt3R1BQEDp06IDDhw9j69ataN++PSpWrIgBAwbg/PnzTrn9IUOGYNq0aahWrRpCQ0PxxBNPIDs7+5rfJSsrCxMnTkStWrUQHByMTp06qe+gO3XqFAYNGqS+nzzfokULLF269Jrv9+GHH6JRo0YIDAxE9erVcd9999mfs1gsmD59OurXr6++U3R0NBYuXOj0+n379qnvJd9PXv/www/jwoULTkX3Tz31FJ5//nlUrlxZnfupU6cW69+NyBUxUBMZVAcsASY5ORk7d+7EK6+8gs8//1wFHpGeno4JEyZg27ZtWLVqFcxmM4YOHaoCmaMpU6bg5Zdfxo4dO+Dr64sHH3xQBaj33ntP3QgcPXoUkydPdnqNvN/BgwdVsP3222/x/fffq8B9LePGjcOmTZswb9487NmzB/fff78qMThy5Ih6fuzYsSqYr1+/Hnv37sUbb7yhgmhh5PtIEH311VcRExOjbkhuv/12+/MSpL/66it8/PHH2L9/P/72t7/hoYcewrp169Tzly9fRq9evdC2bVv1XvJ6ubkZNmyY0+f85z//UTcNf/zxh7oJks9bsWJFif+tiFyCTHNJRKVn5MiRmo+PjxYcHOy0vP766/ZjsrKytDZt2mjDhg3Tmjdvrj3++ONFvuf58+dlOlpt7969avvEiRNq+/PPP7cf8+2336p9q1atsu+bPn261qRJE6e0Va5cWUtPT7fv++ijj7SKFStqeXl5avuOO+7Qnn76afX41KlT6rucPn3aKT29e/fWJk2apB63atVKmzp1arHOzaJFi7TQ0FAtJSXlqucyMzO1ChUqaL///rvT/scee0wbPny4evzaa69p/fr1c3o+Li5Ofe+YmBh7+rt16+Z0TIcOHbQXXnihWGkkcjWsoyYqAz179sRHH33ktE+KYXVS9D137ly0bt0adevWxbvvvut0rORWJScsOUIp1tVz0rGxsWjZsqX9OHm9Ts+Nt2rVymlfYmKi03tLcXKFChXs2507d0ZaWhri4uJUWhxJDjkvLw+NGzd22i85aCmSF5JDHjNmDH799Vf06dMH9957r1O6HPXt21d9RoMGDVSuXBYpKZD0SO7/ypUr6hhHUiwvOWixe/durFmzptAcu1QN6Oks+Pk1a9a86jwQuQsGaqIyIMWuDRs2LPKY33//Xa2TkpLUIq/RSZ2vBLTPPvsMkZGRKlBLgC5Yl+zn52d/LHXWhe0rWFxeEhLAfXx8sH37drV2pAfLv/71r7jzzjvx888/q2Atxddvv/02xo8ff9X7hYSEqGJ6KXaXY+VmROqPpV5dPkvI+0h9eGEN8eQYOTdSvF6QBOPCzktpnAciIzFQExlAcn9S/yqBeP78+Rg5ciRWrlyp6qIvXryo6m/lOWkoJjZs
2FBqny250oyMDNVYS2zevFkF3Tp16lx1rORkJUctuVE9LYWR10qjNFkmTZqk0l5YoBZSly45b1mkjl0azK1evVrlpCUgS6nBHXfcUehrb731VixatAj16tVT70PkDfhLJyoDUjSckJDgtE8CS9WqVVXgkwZSkgsdNWqUKv6V4mrJhT733HOq9bQUK3/66acqlyiB68UXXyy1tEmu/LHHHlON0KT1uARLaTAmNwkFSVHyiBEj8Mgjj6j0SeCWVuTSIE2KlwcOHKgaxkkrbDn20qVLqmi6WbNmhX72Tz/9hOPHj6sGZPI9pXW45HSbNGmictvSulxuYGSftHqXxnYbN25UrdPlZkYarslNwPDhw+2tuqXIXBq6SWO8grl+Ik/AQE1UBqQ1smNRrJBgdOjQIbz++uuqS5MELSHHSVCW4NOvXz9VhyyBR+p+pbhbXvf++++r1uKloXfv3qp7lARLuaGQzy2q+5L0B//HP/6BZ599FqdPn1Y3G7fddpvqMibkxkMCaHx8vAqocuNRsM5dJ7lnaWUun5eZmanSIS3PpUuXeO2111S3MSk+l4Aux0su+u9//7t6XqoBJHC/8MIL6lxJ+qWKQD6zsBsNIk9gkhZlRieCiMqH9KOWLk5LliwxOilEVEy8BSUiInJhDNREREQujEXfRERELow5aiIiIhfGQE1EROTCGKiJiIhcGAO1zezZs9VoRzL1nkzjt2XLFng6me1IhmOUvqkyxGLBLjvSfEGGeJR+vjKKlYwkpc+YpJOhL2VADOk/K31eZSANfShIncy4JKNaybmVEaxkNiN3I/16ZRpJGZRDpqOUqSJl9DBH0i9Y+hPLYCUy0peMea1PW6mTwUtkkBAZ21reRwY4yc3NdTpGhteUvsMySpcMQzpnzhy4GxnnXAZEkd+FLDKe+LJly+zP81xd24wZM9Tfowwko+P5yid98OX8OC5Nmzb17HNl9KwgrmDevHmav7+/9sUXX2j79+9XMxmFh4dr586d0zzZ0qVLtZdeekn7/vvv1exDixcvdnp+xowZWlhYmLZkyRJt9+7d2j333KPVr19fy8jIsB/Tv39/LTo6Wtu8ebP222+/aQ0bNrTPdCSSk5O16tWrayNGjND27dunZngKCgrSPvnkE82d3HnnndqXX36pvsOuXbu0u+66S4uKitLS0tLsxzzxxBNanTp11OxV27Zt02677TatS5cu9udzc3O1li1ban369NF27typzn/VqlXts1CJ48ePqxmkJkyYoB04cED74IMP1OxVy5cv19zJDz/8oP3888/a4cOH1axWf//73zU/Pz91/gTPVeG2bNmi1atXT2vdurV9BjPB85VvypQpWosWLbSzZ8/aF5ldzpPPFQO1pmkdO3bUxo4da9+W6f4iIyPVFIHeomCgtlgsWo0aNbS33nrLvu/y5ctaQECACrZCfsDyuq1bt9qPWbZsmWYymezTIn744YdapUqV1LSOOplu0HHqRXeUmJiovvu6devs50YC0XfffWc/5uDBg+qYTZs2qW25IJjNZi0hIcFpikmZ9lE/P88//7y6CDl64IEH1I2Cu5PfgUzLyXNVuNTUVK1Ro0baihUrnKYa5fm6OlBL5qAwnnquvL7oW8Y9lpmBpFhXJ0MRyvamTZvgrU6cOKHGqnY8L2FhYapaQD8vspbi7vbt29uPkePl/Mn0jPoxMlSlTOuokzGupdhYxoV2VzIGtePUlfIbysnJcTpfUhwXFRXldL5kTG99Okr9XKSkpGD//v32YxzfQz/GnX+LMsSoDImanp6uisB5rgonxbVSHFvwO/F8XU2q4KTKTqZLlao3Kcr25HPl9YFa5vqVC4njP5qQ7YKTKngT/bsXdV5kLfU7BSeekODleExh7+H4Ge5GJoyQ+sOuXbva54aW7yI3I3LjUtT5ut65uNYxchGRGa/cicxlLXWEUscns2otXrwYzZs357kqhNzIyPSf0haiIJ4vZ5JZkPpiGU9f2kJI
pkLawKSmpnrsueKkHEQ3kPPZt29fqU496YlkMpFdu3ap0oeFCxeq2a/WrVtndLJcTlxcHJ5++mmsWLFCNbikog0YMMD+WBosSuCWiVkWLFhgn7rV03h9jlpmApKp8Qq2CpTtGjVqwFvp372o8yJrmafYkbSclJbgjscU9h6On+FOZDpImfVKpnKsXbu2fb98F6lGkQkvijpf1zsX1zpGWk6720VIcjbSWrZdu3Yqpyizgr333ns8VwVIca38HUkLYymRkkVuaGTGNHksOTmer2uT3LNMsSrTnXrqb8vrA7VcTORCIvPrOhZtyrbUp3mr+vXrqx+r43mRYh+pe9bPi6zlD0IuNLrVq1er8yd3ufox0g1M6o10knOQ3JbMR+wupL2dBGkpvpXvKOfHkfyG/Pz8nM6X1MNL3Znj+ZLiYMebGzkX8scvRcL6MY7voR/jCb9F+V3ItJQ8V1dPOyrfVUof9EXafUjdq/6Y5+vapDvosWPHVDdSj/1tGdKEzQW7Z0lr5jlz5qiWzKNHj1bdsxxbBXoiaWUq3RNkkZ/CO++8ox6fOnXK3j1LzsP//vc/bc+ePdrgwYML7Z7Vtm1b7Y8//tA2bNigWq06ds+SVpjSPevhhx9WXXPkXEu3B3frnjVmzBjVVW3t2rVO3UKuXLni1C1EumytXr1adQvp3LmzWgp2C+nXr5/q4iVdPapVq1Zot5DnnntOtVadPXu2W3ahefHFF1WL+BMnTqjfjmxLb4Bff/1VPc9zVTTHVt+C5yvfs88+q/4O5be1ceNG1c1KuldJTwxPPVcM1DbST07+caU/tXTXkn7Bnm7NmjUqQBdcRo4cae+i9corr6hAKzcyvXv3Vn1iHV28eFEF5ooVK6ruDaNGjVI3AI6kD3a3bt3Ue9SqVUvdALibws6TLNK3Wic3ME8++aTqhiR/5EOHDlXB3NHJkye1AQMGqL7kcnGRi05OTs5V/y5t2rRRv8UGDRo4fYa7+Mtf/qLVrVtXfQe5CMpvRw/SgueqZIGa58u5m1TNmjXVd5DriWwfPXrUo88VZ88iIiJyYV5fR01EROTKGKiJiIhcGAM1ERGRC2OgJiIicmEM1ERERC6MgZqIiMiFMVA7kFGTZFJyWVPReK5Khuer+HiuSobny/PPlcv0o54xYwYmTZqkBqefOXOmIWmQITJlKkeZRECGk6Nr47kqGZ6v4uO5KhmeL88/Vy6Ro966dSs++eQTNRMKERERuVCglgHVZfD5zz77zK0maSAiIvKK+ahlbt+BAweiT58++Mc//lGi18qUijt37lTTwJnNN3/PIROPi9OnT6siEro2nquS4fkqPp6rkuH5cs9zJbPJydSZbdu2VdOZFsXQQD1v3jzs2LFDFX0XhzQAcGwEINMr9urVq9TTpU91RtfHc1UyPF/Fx3NVMjxf7nmutmzZgg4dOrhmoI6Li1MNx2SOz8DAwGK9RiafnzZtWqFfVOYiJSIicgdnz55Fx44dVYmwy7b6XrJkCYYOHQofHx/7vry8PJhMJlWMLTlnx+cKy1FL8YXcGUnQr127drmmn4iI6EbFx8ejTp06xYpfhuWoe/fujb179zrtGzVqFJo2bYoXXnjhqiAtAgIC1KIzuo6BiIiorBkWqENCQtCyZUunfcHBwahSpcpV+4mIiLyV4d2ziIiIyIW7Zzlau3at0UkgIi8nbWVycnKMTga5OT8/v0KrcN0+UBspPSsXu+MuI9ei4fbG1YxODhGVM2lXm5CQgMuXLxudFPIQ4eHhqFGjhmokfTMYqG1WH0rE+G93onXtMAZqIi+kB+mIiAhUqFDhpi+u5N03fVeuXEFiYqLavtnuwwzUNm3qhKv1wbMpyMzJQ6Bf6RRZEJF7FHfrQVoatBLdrKCgILWWYC2/q5spBmdjMpvalYJQJdgfOXkaDpxlty8ib6LXSUtOmqi06L+nm23zwEBtI8Vc0bZctdRVE5H3YXE3ueLviYG6kOJvBmoiInIVDNQO
9Bz1LgZqIvJi9erVw8yZM0vUtVZyj2XdYn7OnDmqJbW3YaB2EF07TK1PXryCy1eyjU4OEVGRJDgWtUydOvWG3ldmNBw9enSxj+/SpYuaZCIszHoNpdLFVt8Owiv4o37VYJy4kI7d8cm4g920iMiFSXDUzZ8/H5MnT0ZMTIx9X8WKFZ26DEnr9uvNfSyqVSvZtc/f31/1F6aywRz1NXLVu2JZ/E1Erk2Co75IblZy0fr2oUOH1JwKy5YtQ7t27dSERhs2bMCxY8cwePBgNb2iBHKZC3nlypVFFn3L+37++edqxkNpydyoUSP88MMP1yz61ouof/nlFzRr1kx9Tv/+/Z1uLHJzc/HUU0+p46RLnEzGNHLkSAwZMqRE5+Cjjz7CLbfcom4WmjRpgv/+979ONydSqhAVFaW+f2RkpPpM3Ycffqi+i0y1LOfjvvvugytioC7A3vI7noGaCN4+aEV2riFLac4+/OKLL2LGjBk4ePAgWrdujbS0NNx1111YtWoVdu7cqQLooEGDEBsbW+T7TJs2DcOGDcOePXvU60eMGIGkpKRrHi8DfvzrX/9SgXP9+vXq/SdOnGh//o033sDcuXPx5ZdfYuPGjWo2RJn+uCQWL16Mp59+Gs8++yz27duH//u//1OzMK5Zs0Y9v2jRIrz77rv45JNPcOTIEfX+rVq1Us9t27ZNBe1XX31VlUIsX74ct99+O1wRi76LaPktfyzsrkHknTJy8tB88i+GfPaBV+9EBf/SuTxLIOrbt699u3LlyoiOjrZvv/baayrgSQ553Lhx13yfRx99FMOHD1eP//nPf+L999/Hli1bVKAvjPQd/vjjj1VuV8h7S1p0H3zwASZNmqRy6WLWrFlYunRpib7bv/71L5WuJ598Um1PmDABmzdvVvt79uypbg6kdKFPnz5q7G3JWXfs2FEdK8/JjI133323KnmoW7cu2rZtC1fEHHUBzWqGws/HhIvp2Yi/lGF0coiIbkr79u2dtiVHLTlbKZKWYmcplpbc9vVy1JIb10mACw0NtQ+RWRgpIteDtD6Mpn58cnIyzp07Zw+aQkbukiL6kjh48CC6du3qtE+2Zb+4//77kZGRgQYNGuDxxx9XNyRS5C7k5kWCszz38MMPq9y9lAK4IuaoC5ChQyVY74lPVt206lTmSEVE3ijIz0flbI367NIiQdWRBOkVK1aoXGfDhg3VUJdSN5udXXRPF8mROpLSRovFUqLjS7NIvzjq1KmjirWlDl6+s+S833rrLaxbt07lonfs2KHq13/99VfVEE/qs6XFu6t1AWOOuhAc+ISIJLBI8bMRS1lWuUl9sBQXS5Gz1NdK0fDJkydRnqThmzTekqCokxbpEjhLolmzZur7OJLt5s2b27flRkTq4KWoXoLypk2bsHfvXvWctICXYvE333xT1b3LeVi9ejVcDXPUhYiuLYH6FAc+ISKPI62cv//+exW85IbglVdeKTJnXFbGjx+P6dOnq1x906ZNVZ31pUuXSnST8txzz6kGblK3LAH3xx9/VN9Nb8Uurc/lBqBTp06qKP7rr79WgVuKvH/66SccP35cNSCrVKmSqh+X8yAtx10NA3URLb/3nUlGTp4Ffj4seCAiz/DOO+/gL3/5ixqkpGrVqqpblLS4Lm/yuTK16COPPKLqp2WAlTvvvLNEs0wNGTIE7733nirGl9bf9evXV63Ie/TooZ6XImxp8S6NzCRgSwmCBHPpDibPSVCX4u7MzEx1A/Ptt9+iRYsWcDUmrbwrDUpRfHy8qoOIi4tD7dq1b+7NcrOAUxuBC0dh6fA4ol/9FamZufj5qW5oEcnRdog8mVyoT5w4oS700qeWyp/kZqUoW3LI0hLd039X8SWIX8wq6jIuAf8dCix7HubsFFvxN8f9JiIqC6dOncJnn32Gw4cPqzrjMWPGqKD24IMPGp00l8NArQupAVSqL8McAHFbEV3HmotmgzIiotJnNptVHbKMjCZdqiRYS92y5KrJ
GeuoHUV1Bi6dAGI3oU0d64D0u+OSjU4VEZHHkWLfgi22qXDMUTuq29m6jt1sH/P7cGIq0rKsHeSJiIjKGwN1wRy1OL0NERVMiAwLhDS12xvPXDURERmDgdpRlYZAhSpAbiZwdjfaRHGCDiIiMhYDtSPpaK/nqmM35bf85pSXRERkEAbqgqJuy6+n5pSXRERkMAbqguw56s1oFRkCswk4m5yJcymZRqeMiIi8EAN1QTWjAd8gICMJwakn0Lh6iNrNgU+IyFPJkJvPPPOMfbtevXqYOXNmka+RMbmXLFly059dWu9TFBkmtE2bNnBXDNQF+fgBtW3zt5763V5PzYFPiMjVyMQa/fv3L/S53377TQVBmRWqpGRWKxl7uzyC5dmzZzFgwIBS/SxPw0B9neJvtvwmIlf12GOPqXmWZdzogmRyivbt26N169Ylft9q1aqp2abKg0yzGRAQUC6f5a4YqAtT/3agQU+Vs9Zz1HvikmGxuO38JUTkge6++24VVGUoTkdpaWn47rvvVCC/ePEihg8fjlq1aqngKzNIySxRRSlY9H3kyBE1HaRMLCFzPcvNQWGzYTVu3Fh9RoMGDdT0mTk5Oeo5Sd+0adOwe/dulcuXRU9zwaJvGUq0V69eajpKmeVq9OjR6vvoZC5tmTVLZsyqWbOmOmbs2LH2zyruBCCvvvqqmgxDbhIkp798+XL789nZ2Rg3bpx6f/nOMi2mTMkpZB4rKR2IiopSr42MjMRTTz2FssQhRAtTv7t1AdA4z4IgPx+kZuXi+IU0NIyw1lkTkZfITi/5a3wCAB/b5TUvF8jLAkxmwC/o+u/rH1zsj/H19VXTRErQe+mll+xzOUuQlmkdJUBLkGvXrp0KpKGhofj555/x8MMP45ZbbkHHjh2LFdT+9Kc/oXr16vjjjz+QnJzsVJ+tCwkJUemQwCXB9vHHH1f7nn/+eTzwwAPYt2+fCob6XNFhYVfPSpienq6muuzcubMqfk9MTMRf//pXFTQdb0bWrFmjgqisjx49qt5fgq18ZnHI1Jhvv/02PvnkEzWX9RdffIF77rkH+/fvV9Ndvv/++/jhhx+wYMECFZBlhitZxKJFi/Duu+9i3rx5akpMmapTbkDKEgP1dfj6mNGqVhi2nEzCrrhkBmoib/PPyJK/5v45QIuh1seHfgS+exSo2w0Y9XP+MTNbAVcuXv3aqSUbCVHmln7rrbewbt06+zzMUux97733qmAoy8SJE+3Hjx8/Hr/88osKQsUJ1BJYDx06pF4jQVj885//vKpe+eWXX3bKkctnSjCTQC2544oVK6obCynqvpZvvvlGTQ351VdfITjYesMya9YsVRf/xhtvqJsFUalSJbVf5q5u2rQpBg4ciFWrVhU7UEtuXG5c/vznP6tteW8J+lKKMHv2bMTGxqqA3a1bN3XzIzlqnTwn36FPnz7w8/NTgbw45/FmsOi7KGmJQMJe+0xau+IuGZ0iIiInEqi6dOmicoVCcpjSkEyKvYXkrGV+Zynyrly5sgqYEnQl4BTHwYMH1QQaepAWkuMtaP78+WoWLAli8hkSuIv7GY6fFR0dbQ/SomvXripXHxMTY98nOVkJ0jrJXUvuuzhSUlJw5swZ9b6OZFs+Xy9e37VrF5o0aaKKtX/99Vf7cffffz8yMjJU8b7cGCxevBi5ubmem6P+6KOP1HLy5En7yZ88ebJrtACMWQ58+wBQozWiu8xXuziTFpEX+vuZGyv61jUdZH0PKfp29MxelBYJypJTltyg5KalWPuOO+5Qz0luW4p6JbcowVqCoBRdSz1sadm0aRNGjBih6qGl6Fpy8ZKbluLlsuDn5+e0LbleCeal5dZbb1VzYy9btkyVKAwbNkzloBcuXKhuWuSmQfZLXf2TTz5pL9EomC6PyFFLRf6MGTOwfft2bNu2TTUgGDx4sKoncIn+1JD6Hg1tIq13dwfPpiAzJ8/olBFReZI645Iuev20kMeyz7F+uqj3vQESSGR+
Zyk6lmJjKQ7X66tlKkm5rj700EMqtyo5wcOHDxf7vWV+aKmflW5Uus2bNzsd8/vvv6viYaknl5bmUmx86tQp56/r769y99f7LKnvlbpq3caNG9V3k9xtaZB6eikdKDjFpmxLQznH46Tu+7PPPlOlBVI3nZSUpJ6Tonwpjpe67LVr16obFamX98gctXxRR6+//rrKYcuPQHLXhgqtCbx4CggMQy1NQ9WK/riQlo39Z1LQrm4lY9NGRORAipolqEyaNEkV7UrRrU6CpuQEJZhK3e4777yDc+fOOQWlokhOUlpzjxw5UuUc5f0lIDuSz5BibslFd+jQQTVYkyJhR1JvLblUKVKWTJo0NCvYLUty5VOmTFGfJS2rz58/r0oKpPGbXj9dGp577jn1OVLyII3QpBRC0jV37lz1vJwjKU6XhmZykyCN86RIPzw8XDVqkxuOTp06qRbuX3/9tQrcjvXYHltHLV9c/pHlTqqw+g+RlZWlfiT6kpqaWraJCrTWTcudKQc+ISJXJsXfly5dUkXPjvXJUlcsRbmyXxqbScCR7k3FJYFKgq7Uy0qjKWmFLZkqR9Ji+m9/+5tqnS2BT24KpHuWI2ncJoOz9OzZU3UpK6yLmAQ+qT+XnKsE/Pvuuw+9e/dWDcdKk9Q7T5gwAc8++6yqDpDW6NLKW244hNxEvPnmm6p0QNIh1bNLly5V50KCteSypU5b+qhLEfiPP/6ouomVFZMmncIMJMUFEpilpZ/cFUrRzV133VXosXKHJXUgBUmxjNyhlRlLHj5YcxxvrziMwW0i8d6f25bdZxFRuZPrj+T26tevr/rNEpX170oGqZH67uLEL8Nz1FLvIEUO0j9vzJgxqsjjwIEDhR4rxTrSh09frnVcqbmSBHx5F/BmA7SJtI7SwzG/iYioPBnej1oaGDRs2FA9lk750sldWihKR/SCpD7DsU5Dir/LVFAlIPEgkHkZbf2s3QxOXbyCS+nZqBTsX7afTURE5Ao56oKkib3URbsEaTVpG/e74rmtaFDV2iKT434TEZFXBGopyl6/fr2qqJe6atmWpu7S8s9lRN1mXcduRnQda4MyFn8TEZFXFH3LSDIyTq30z5MO8tKCTlr89e3b18hkXWMmrU2I7hqCxTvZ8puIiLwkUP/73/+Gy5OBT3yDgIwkdAqzdnbfHZ+sZlDRBxQgIs9QmqNbEVlK6fdkeGMyl+frr6a7xMnf0DBjL/x8aiApPRtxSRmIqlI+87USUdk3apU+sjIGtPTxlW3eiNONkoycDNEqA7bI70p+TzeDgbq49dQnf4Pf6S1oXvMhlaPeFX+ZgZrIQ8jFVPq6SjWcBGui0iADuMjsWvL7uhkM1CVqULYJbeqNU4Fa6qnvib6B6e+IyCVJrkcuqjIT0vXGpCa6HpndS6b1LI2SGQbq4qjd0TrzzaWT6NQhG/9hy28ijyQXVZkBqaxmQSLyiH7ULikwFKjeUj28FYfUet/pZOTkseEJERGVLQbqEnbTqn55J0IDfZGVa0FMQhlPCkJERF6PgbqE9dSm+G0c+ISIiMoNA3Vx3dITGLUcGLWMU14SEVG5YWOykkzQUdda/N3GlqPmmN9ERFTWmKO+Aa3rhKn1kcQ0pGbmGJ0cIiLyYAzUJZF0HPh5IiLWvoBa4UHQNGDv6WSjU0VERB6Mgbok8nKArZ8Bu+ejXW3blJdxDNRERFR2GKhLompjoMtTwNCP0bp2qNq1K+6S0akiIiIPxsZkJSFDwfV7TT1sdfwigOPMURMRUZlijvoGtaodBrMJSEjJREJyptHJISIiD8VAXVJ5ucDJDaiwZRYaR1RUuzjwCRERlRUG6pLSLMDX9wIrp6BPdesQouxPTUREZYWBuqR8/YFa7dXD2wOOqDVHKCMiorLCQH0T4343ztqn1nvik5Fn0QxOFBEReSIG6puYSSvs/HYE+fkgLSsXx8+nGZ0qIiLyQAzUN6JOB+mrBdOlE+heM1ftYoMyIiIqCwzUNyIw
DKjeUj3sH3pSrRmoiYioLDBQ32Q9dVvtkFqz5TcREZUFBuobZZvyslbqbrU+dDYVmTl5BieKiIg8DQP1japjzVH7nd+HqGALci0a9p/hcKJERFS6GKhvVFgtIDwKJs2CIVXj1a5dHPebiIhKGQN1KXTT6hZwVK058AkREZU2BupSHPiELb+JiKi0MVDfjLpdgXrdEdikt9qMTbqCpPRso1NFREQehIH6ZlRrAjz6EwJ7PY8G1YLVLnbTIiIiwwN1XFwc4uOtDajEli1b8Mwzz+DTTz+Ft2pTO1ytWU9NRESGB+oHH3wQa9asUY8TEhLQt29fFaxfeuklvPrqq/A6V5LQK1Rv+c1ATUREBgfqffv2oWPHjurxggUL0LJlS/z++++YO3cu5syZA69yZhfwZn303z1eJqtWOWpN40xaRERkYKDOyclBQECAerxy5Urcc8896nHTpk1x9uzZYr/P9OnT0aFDB4SEhCAiIgJDhgxBTEwM3EpEM8A3CObgKqjhk4ZLV3IQl5RhdKqIiMibA3WLFi3w8ccf47fffsOKFSvQv39/tf/MmTOoUqVKsd9n3bp1GDt2LDZv3qzeR24A+vXrh/T0dLgN3wBgYgzM47ehemQdtWtn3CWjU0VERB7C90Ze9MYbb2Do0KF46623MHLkSERHR6v9P/zwg71IvDiWL1/utC3F5pKz3r59O26//Xa41WxaMkFHnXBV9L07LhmD29QyOlVEROStgbpHjx64cOECUlJSUKlSJfv+0aNHo0KFCjecmORk6xCclStXhjtqU4tdtIiIyAWKvjMyMpCVlWUP0qdOncLMmTNV/bLkiG+ExWJRXby6du2qGqcVRj5Tbg70JTU1FS7Bkgd8NRj3LLsN1XAZ+04nIyfPYnSqiIjIWwP14MGD8dVXX6nHly9fRqdOnfD222+rxmAfffTRDSVE6qqlNfm8efOKbHwWFhZmX5o3bw6XYPYB0i/CnJuB2wOPIivXgpgEF7mJICIi7wvUO3bsQPfu3dXjhQsXonr16ipXLcH7/fffL/H7jRs3Dj/99JPqm127du1rHjdp0iRVPK4vBw4cgKuN+9234gm1Zn9qIiIyLFBfuXJFdakSv/76K/70pz/BbDbjtttuUwG7uKS/sQTpxYsXY/Xq1ahfv36Rx0uXsNDQUPuip8GVAnUb7aBaM1ATEZFhgbphw4ZYsmSJGkr0l19+UV2qRGJiogqgJSnu/vrrr/HNN9+ooCujnMkideDuOuVl9SuHEYwMDiVKRETGBerJkydj4sSJqFevnuqO1blzZ3vuum3btsV+H6nPliJsaUVes2ZN+zJ//ny4nbBaQFgUTJoFbcxHcfR8GlIzc4xOFREReWP3rPvuuw/dunVTo5DpfahF7969Vf/q4vK4oTbrdgb2xKJX0HFsTG+FvfHJ6NKwqtGpIiIib5zmskaNGir3LKOR6TNpSe5ahhH1WrZ66m7+h9V6F/tTExGREYFa+jzLLFnSRapu3bpqCQ8Px2uvvaae81q2euoGWQfhi1zWUxMRkTFF3zKd5b///W/MmDFDDVAiNmzYgKlTpyIzMxOvv/46vFLVJkBgOPwyL6O56RR2xVlHKiMiIirXQP2f//wHn3/+uX3WLNG6dWvUqlULTz75pPcGarPZWvx9eDk6+hzG5ym3ICE5EzXCAo1OGREReVPRd1JSUqF10bJPnvNqtnrqHkHH1Jr9qYmIqNwDtbT0njVr1lX7ZZ/krL1aVBe1aoUj0q6dgZqIiMq/6PvNN9/EwIEDsXLlSnsf6k2bNqkBUJYuXQqvFtkGGPkjViREAP87xgZlRERU/jnqO+64A4cPH1Z9pmVSDllkGNH9+/fjv//9L7yabwBQ/3a0rB+pNveeTkaexcP6ixMRkWvnqEVkZORVjcZ2796tWoN/+umn8HaNIkJQwd8HaVm5OHY+DY2ru9C45ERE5PkDnlAR0hLh8+tL+LKCdSYx1lMTEdGNYqAuCz7+wOYP0Slz
I6rhMuupiYio/Iu+qQhB4UDPv2NXWjiu/BbAHDUREZVPoJYGY0WRRmVkc8fziLicgfTfVuNQQioyc/IQ6OdjdKqIiMiTA7WM7X295x955JGbTZPHqBkWiGohATifmoX9Z5LRrm5lo5NERESeHKi//PLLskuJp9E0mOK24IWQ5Zic2hk7Yy8zUBMRUYmxjrqsmEzAor/ivuRYfG+uht3xDYxOERERuSG2+i6Hcb87mg+x5TcREd0QBupyCNTtTTGITbqCi2lZRqeIiIjcDAN1WYqyjoPezucYfJGLPfHJRqeIiIjcDAN1WarWFAgMRxAy0cwUy/7URERUYgzUZclsthd/dzDHMFATEVGJMVCXNXugPoTd8ZehaZxJi4iIio+BupzqqTuYD+PylWzVqIyIiKi4GKjLWmRbwCcAVU3JqGdKYPE3ERGVCAN1WfMNAGrdqh6ynpqIiEqKgbpc+1Mf5sAnRERUIgzU5SGqi71B2b4zKcjOtRidIiIichMM1OWhTkdoUZ2xxnwbcnNzEZOQanSKiIjITTBQl4egcJj+shzrosbCAjN2xbP4m4iIioeBuhy1qROu1rtiGaiJiKh4OM1lOWpX3Yx2phjsjq9odFKIiMhNMFCXl9QE3P59e3T1NyH6/OdIycxBaKCf0akiIiIXx6Lv8hJSA6bQWkgwR6AmLmAvZ9IiIiJXD9Tr16/HoEGDEBkZCZPJhCVLlsCjjd2MGY2+xVGtNgc+ISIi1w/U6enpiI6OxuzZs+EVAkLsDco48AkREbl8HfWAAQPU4k0kUJthwe7YJDWTlpQkEBERXQvrqMtZmy0TsTvgcUSkxyAhJdPo5BARkYtzq1bfWVlZatGlprrfCF++OWkIMWWoCTqk+LtmWJDRSSIiIhfmVjnq6dOnIywszL40b94cbjtBhzkGO1lPTUREnhSoJ02ahOTkZPty4MABuJ2ozmqlctSxl4xODRERuTi3KvoOCAhQiy4lJQVuJ7ItLGZ/VLMkI/n0YeRZOsPHzAZlRETkgjnqtLQ07Nq1Sy3ixIkT6nFsbCw8ll8gTLXaqYct8g7gaGKa0SkiIiIXZmig3rZtG9q2basWMWHCBPV48uTJ8GSmutZ66g4y7jfrqYmIyFUDdY8ePVRf4oLLnDlz4NFs9dTSoIxTXhIRkcc0JvMYdTqq1S3mszhx8qTRqSEiIhfGQG2EoErIqdJUPQy/sAMZ2XlGp4iIiFwUA7VBfOt1UetbTYew/wxn0iIiosIxUBvEVLeLvT81Z9IiIqJrYaA2eISyZqZY7DuVaHRqiIjIRbnVgCceJbwO9vX+CsN/zkb4mStGp4aIiFwUc9QGiupwF9JMFRCXlIGLafmTjRAREekYqA0UGuiHW6pVVI93sz81EREVgoHaSFlpeMnvW3zj9w/sjr1odGqIiMgFMVAbyS8IXZN/QhefA0g6vtPo1BARkQtiYzIjmX1wof0EvL3+HLYmBKrhU00mzqRFRET5mKM2WNU+z+BHcw/EZgbi1EW2/iYiImcM1Abz9zWjZWSoesyBT4iIqCAGahfQv3ICHvNZiuPHDhudFCIicjGso3YB9yZ+gCp+O/H+qWoAehqdHCIiciHMUbsA33rW+alrJu9Cdq7F6OQQEZELYaB2AaFNblfrW3EIhxJSjE4OERG5EAZqF2Cq00mtbzGfxaFjx41ODhERuRAGaldQoTIuVGigHqYf2WB0aoiIyIUwULuIrJod1briuW1GJ4WIiFwIA7WL1VM3ytqHlMwco5NDREQugoHaRYQ07q7WLUwnsf/kWaOTQ0RELoKB2lWE1cEl32rwM+XhuyVL8Nn647jAOaqJiLweA7WrMJmQU8va+nt6xqtovXI4npv+Dv7vv9uw+tA55OaxfzURkTfiyGQuJKLnk7As2o6A1LPoZDqET3Nz8Mv+c2rpWTEWoyMOou5tQxDZupfRSSUionLCQO1K6nWFecJB4OIx4NQGTKrWF/X3JOP7nafRLnMTOp/5
HxZ9dxzzNwVhWPs6uKtlBCqcWgtEdQICw4xOPRERlQEGalcj81FXbaiWhgBejqqF5/s3xe41Sdiw6wpWXW6BLSeS1PLtD/FYZHoemskM1GgFU91uKtgjqrPqm01ERO6PgdpNpsLs0PcBoO8DuCU5Ay12nMaCbXEIvHQJJ3yro775HHB2t3XZPFuiPVC9BVC3qzVwyzq4qtFfg4iIboBJ0zQNbio+Ph516tRBXFwcateuDW9isWjYcjIJC7bGYdu+/WiTdwCdzAfRyecgGprOXP2Cak2Bul2A+rcDLYYakWS6FksekH4eSD0LpJy1rmXJuAwEhFirNeTfrFJd6/FXkoD0C9abL3ctOZHLzuVYIDMZyLxs/a6Oa9mvHicDfkFAxQigYnWgyQBVeqTk5QCaBfANMPrbuK+8XODKReDKBes6vG7+7yzxELBuhvX3N+i9/Nesfh1ISwCCKhW9+FWwlhDSTccv5qjdlNlswm0NqqglZXAL/Lj7DBZsi8dLcZdRFcnoaD6InoGH0SPgCKplHAPOH7Iu8VudA/XRlUC1ZkBYLSO/jufKTAEsufkBVQLPqtfyg3FqgnXR8op+n1rt8i+g+xYBSycCzQYBD3ydH/hmdwL8g60XVqclFAgMz98OCM1/HFwN8PW/sZuLQoOsbV/zwUBl67C4iFluveDLdxj4dv57fHCr9dyURHhUfqA+thr4ZhggVT6jfs4/Zv1bgMnHGthVgLcF+QpVAR8vuuSdWA+knLHe1OmBON0WlNW+i9Z/K0d9XwO6PmV9nJMO7F8MhBa4Nhz6GUjcf/3P9/HPD9rtHwM6jc6/0dz2hfW3125k/vGSJnmN3JwywDvxol+t5woN9MOITnXVIrNvLdgaj8U7q2LplduAK0AlpGBEzdMYHH4C9Ro0gZ/+wpwM4NvhQF428NQuoHL9/D8k+WPxsR9JBeVmW3MVjjlgWTqOBsJsd8cb3gVWTgXaPAQMmW27w/IDtn529ftJO4PgCCC0JhBiW+QCl51uDX76e+ok8MrzupwrwIWYkn+PB+YCze7OD6hr/wnU6w7c+Xr+91z0F4dgbAvEWdeZ5a3yLfmBWi74Z3YCfsEO39cEVKwBWHJs3yW8kLXtZiL7CpCeCKSdAyKa5b9HWqJ1LTluRxvfv0b6TECFKrYAXi0/kMt5b9QPiGiafxMix5rLqPeqxeL83vLvK6UDctNiX/Kct+VvVQVaW9Ct3wOo3c76+rgtwMLHrL+dx37Nf9+fJgAXjxQjQSbrb0lKaPwr5O+uVB/oPwMIqeF8uATy5Djrb0KuFRmXrl7k31WuK/JvJovjv0fKaWD1a9bz7hioF4xUjWjVTVahufRw63NSiqIvDXrk/37TzgMrp1iD/aCZzjduCXttr9Gsazm/ju9jX2zPN+oLdJ9gfX1WGjDnLuv+v66+sRvbm8RA7WGa1gjF5EHN8cKAJlh1MBHzt8Zh/RFg1tlQzDrbDCHHfTHowl7Vajw6+BJMUpctP/BK9fLfZNFfgeNrgfA61v3yBytrCeT6tuTSPJkEqONrri6O1h/LxbIw9e/ID6pyIRKOuRa5EPaYZA0YekCWC6wcW9zcXsfHrYsjuTg9utR60ZeLosrZFlgK2+/475gSb23nEFbH4X39gJhl1875SvDVg6sEVf1xaGT+MVFdgOHznfeJCcXIlRWl7UNA04HWgOAYBNv/xRrEJUCoAJ9orVqQC63KWV4AbDHeToKRHqiPrADmPWgNAg9/n3/M3GHWG6eiAqqUjDjuk0DXepjtfVcCc+8DarYG/m99/vt+3B24fKpk372vb36gln+j5Nir/43qdLSWlElJgvzeggusZb88liBo9rn6M6QU6LYxV++P/nPRaZNgJ+fJMXDLtUQnpT5tH776Bis7zfb6vPx/p+vxr5AfqOX1u+YC/hWdA/WpTcCxVSgR/SbTmiDr34WeNgMwUHuoAF8f3NWqplrOXM7Aou3xWLA9DnFJGfjmj1i1NKkegvvb
f4GhrauhimNRkwQi+UFeOmldsPbqDwiq7By45Q406ja4tJzM/IArOTM9R3rwR+D3WdZubn1fte6Ti7oUqxZFgqNc4EMirWsJRBJ4dS3/BLQYYr0wOerxYml/M+vFWhoOlkTB5imNB1jrKB1z6vK7uHum9aLqlNO1rYuTu5AbEVlKm6StYB295FT7Trv6WAmekvtzDN4qt2d7XLVx/rGyX37/5gKXx9hN1y9JKEhKOhzTJhd9uZlwSrMtSEpuUT7Tvsi2bZ/UwzsGV2lzoqvaBHhs5dUNRod8CEPIv0tAReviGKAdg+DgWVfv/7911pKDwnLoGbZFfrNS+qQvjtcc+d32mQr4FGizIDe10rbB8XWFLqb8x1LFovMNAkYstD4vf/MGYGMyL2uAtvnERXy3LR5L955FVq71guHnY0KfZtUxrEMd3N6oGnzkYiIXq0snrIE6ybbWtyV3UlDvKflFRedjgHkjrDmH+77IP+ZynPVi41i8VhrkJyzFglIfp3K9BdcSnM9Y/9B1IxYBjfpYH+/6BlgyBmjQE3hkSf4xXw60BlnH4mgVjG3BWYIE69I8jxRDy29cgrtjoDnwP+sNnB5IVWAtIrjKIr8V/cYnN8taXCw3N443Q9KgS17D35JXiS9B/HKJQD179my89dZbSEhIQHR0ND744AN07Gid9rEoDNQ3Ljkjx9YALQ574pPt+2uEBqJv8+qo4O+jGqz5mEz2tY/Z2ogtMO8KwrPOICzzNEIz4xGaEY/YyAFIqtoeZpMJtRPXoPOW8bgc3hy/9/5e7fMxm9BleX8EpxxHVlAEsirWQVZIXWSHRiE7tC5yQqOQF1oPluCq8JEPssVfi6ZZq400wP9SDCrGrUFOUHVcbDBY5U603GzcOrcVzBaH4s8i5PkEICuwOg5G/x3nI3vAogGB6acRmrQHacF1kRzW1Pp50OBrNsPPxwx/X5Na64u/2iePrfutjx22fczqPBEReUSgnj9/Ph555BF8/PHH6NSpE2bOnInvvvsOMTExiIiw1fFdAwN16Th4NkUF7MU7T+PylZufYjMUaWhhPgUzLNhosbXQhYYtAWMRYSrQyrSAdC0AsZr1372G6RL+lvMk1lraqO0/mdfjHf+PsSGvBR7Kecn+mh0Bo1HZlIbzWijOaZWRoFWyrxNQGedkbdtOgRRDl30Q9TXrwd3kEMjzg7q/Lairxde67Xicfozsk2oMWVsfW9f6zYI8F1DwOcfX+JgR4Gdb+5phYq6NvJimacjJ05CdZ0FWTp5tbVGli9m5ss6zrfUlf1tfd29UFS1rhXlXoJbg3KFDB8yaZa2zsFgsKvHjx4/Hiy8WXZfHQF265Ee58kAi9py+jLw8DXmaporLZS1zguiP8/dZc7t5hey36GsLnPYF5aYiIvcMqucloEbeWdSwJCDSkoCalgRE4CJ84Fx/97rPE/jBp6/KlTexHMODeT/isLkBFvgPUSWFsr+KJQmp5hDkmf0hGVmT/CdrkzyyVg3KI5XJlRICdYz1tU7H2bZlLXItFvVHnZNn/SNV6zwLcnK1/MdqsX4/dyA3DoUFcedAnx/85YbDetqs50T93xbr9fOs78p/bDtWnVv9k4txrH6kw82Edb/t30z/tylkn57Ggsfr+8wF/o1R4N/b8XdQ2PG+PlIqZIaflC7ZbsJkLfv1x35m29qnkGPkOR/nYzzppknCiPwNqL8XiwW5eZqaSCjHYlvLtm2//M3kqmPztwt73vF1Oer9JMDmqcCaH2Dzg619X6FBOM+6zrNc1TSjpF4b0hIP32brKukN/aizs7Oxfft2TJo0yb7PbDajT58+2LRp01XHZ2VlqUWXmppabmn1BnIBH9i6ploMa2kt3T6kLlyEROKlSnXxkr0xVm8Ao9EPwDi4FutFSg/kDgHeIchbA75+YbIuWQ7H5t8Q5N8cON35O9zZO97965/h/Fz+6x1Z3zsX4AyqhpObCV8f682QXgJTMMBbn7NuC6mS0ekB
xzHw6A8Ly385H6cV67X2R3JDrmmFBlo9kLoj/wI3qflrH/t2wX0NqhZoHFoODA3UFy5cQF5eHqpXr+60X7YPHTp01fHTp0/HtGmFtOgkzyCNbKrcYl3cjFxIfcw+CPQrpJuLgaSUQwVyPfBfI9hbcyEFj8uzl5roF/f8x7a1rT5ff4wCF3vH44t6H9nh+J6Ox+mvlffUbN9Jre3tF6zb9jYNts93bOMgBR7295PSH9v7yvOwPSelP2ptaxOhv69eGiSBSc816jdmsla5P3tuUJ7TA1jRQUw+R51reCYpMJASBP2Gw1pCY92WmxF1E2KrHnJ8bH3Otl/2mW03LT75pUB6lU+hAdbhmGsFYfXYjdqSuFX3LMl5T5hga1kM4PTp02jevLmhaSJyZarxnwveQHgba0B3COB60JfqIltxsWPw128G9OJjuTnJr3LI51RN4LRf3+dcjVDYsSjmsdZcvjXAqmBqK8rXA68eWB1LB8gDAnXVqtLC1wfnzp1z2i/bNWrUuLpoNiBALbqUlBL2aSQiMrDEJcCtskbkKspojLzi8ff3R7t27bBqVf6oMdKYTLY7d+5sZNKIiIhcguH3d1KUPXLkSLRv3171nZbuWenp6Rg1apTRSSMiIjKc4YH6gQcewPnz5zF58mQ14EmbNm2wfPnyqxqYEREReSPDA7UYN26cWoiIiMiF6qiJiIjIDXLUN0oanomzZ88anRQiIqJi0+OWHsc8NlDr3bqKM4EHERGRK8axqCiHaTVdcazvm5Gbm4udO3eqhmcy9OjNkiFJZQCVAwcOICQkpFTS6A143m4cz92N4Xm7cTx3rnHeJCctQbpt27bw9fX13EBd2mQAlbCwMCQnJyM0NNTo5LgNnrcbx3N3Y3jebhzPnfudNzYmIyIicmEM1ERERC6MgdqBjCM+ZcoUp/HE6fp43m4cz92N4Xm7cTx37nfeWEdNRETkwpijJiIicmEM1ERERC6MgZqIiMiFMVDbzJ49G/Xq1UNgYCA6deqELVu2GJ0kl7d+/XoMGjQIkZGRMJlMWLJkidFJcgvTp09Hhw4d1KAJERERGDJkCGJiYoxOllv46KOP0Lp1a9WPVRaZt37ZsmVGJ8vtzJgxQ/3NPvPMM0YnxeVNnTpVnSvHpWnTpuWaBgZqAPPnz1fzYkuLvh07diA6Ohp33nknEhMTjU6aS5N5w+VcyU0OFd+6deswduxYbN68GStWrEBOTg769eunzicVrXbt2irIbN++Hdu2bUOvXr0wePBg7N+/3+ikuY2tW7fik08+UTc8VDwtWrRQY3Pry4YNG1CupNW3t+vYsaM2duxY+3ZeXp4WGRmpTZ8+3dB0uRP5KS1evNjoZLilxMREdf7WrVtndFLcUqVKlbTPP//c6GS4hdTUVK1Ro0baihUrtDvuuEN7+umnjU6Sy5syZYoWHR1taBq8PkednZ2t7s779Olj3yfjhsv2pk2bDE0beQcZklBUrlzZ6KS4lby8PMybN0+VREgROF2flOQMHDjQ6XpH13fkyBFVxdegQQOMGDECsbGxKE9uPXtWabhw4YL6g5eJPRzJ9qFDhwxLF3kHGZhf6gm7du2Kli1bGp0ct7B3714VmDMzM1GxYkUsXrxYTZZARZObGqnak6JvKj5pszRnzhw0adJEFXtPmzYN3bt3x759+8ptUhOvD9RERudw5A++3Ou83JhcMHft2qVKIhYuXIiRI0eqen8G62uLi4vD008/rdpESINZKr4BAwbYH0u9vgTuunXrYsGCBXjsscdQHrw+UFetWhU+Pj72ua11sl2jRg3D0kWeb9y4cfjpp59U63lpJEXF4+/vj4YNG6rH7dq1UznE9957TzWQosJJ9Z40jr311lvt+6QkUX57s2bNQlZWlroO0vWFh4ejcePGOHr0KMqL19dRyx+9/LGvWrXKqThStlnvRWVB2t5JkJYi29WrV6N+/fpGJ8mtyd+rBBq6tt69e6sqAymJ0Jf27dur+lZ5
zCBdfGlpaTh27Bhq1qyJ8uL1OWohXbOk+Ex+uB07dsTMmTNVA5VRo0YZnTSX/8E63lWeOHFC/dFLo6ioqChD0+bqxd3ffPMN/ve//6k6roSEBLVf5roNCgoyOnkubdKkSaooUn5fqamp6jyuXbsWv/zyi9FJc2nyOyvYBiI4OBhVqlRh24jrmDhxohovQoq7z5w5o7rxyo3N8OHDUV4YqAE88MADOH/+PCZPnqwumm3atMHy5cuvamBGzqQfa8+ePZ1ueITc9EjjC7r2oB2iR48eTvu//PJLPProowalyj1I8e0jjzyiGvXIjY3UGUqQ7tu3r9FJIw8VHx+vgvLFixdRrVo1dOvWTY2BII/LC2fPIiIicmFeX0dNRETkyhioiYiIXBgDNRERkQtjoCYiInJhDNREREQujIGaiIjIhTFQExERuTAGaiIiIhfGQE1EN81kMmHJkiVGJ4PIIzFQE7k5GXZUAmXBpX///kYnjYhKAcf6JvIAEpRlrHBHAQEBhqWHiEoPc9REHkCCssyf7rhUqlRJPSe5a5kIRGadktm5GjRogIULFzq9XqZA7NWrl3peZlQaPXq0mh3N0RdffIEWLVqoz5Ip/mSqTkcXLlzA0KFDUaFCBTRq1Ag//PCD/blLly6pKRVlIgP5DHm+4I0FERWOgZrIC7zyyiu49957sXv3bhUw//znP+PgwYPqOZnS9c4771SBfevWrfjuu++wcuVKp0AsgV6m55QALkFdgnDDhg2dPmPatGkYNmwY9uzZg7vuukt9TlJSkv3zDxw4gGXLlqnPlferWrVqOZ8FIjcls2cRkfsaOXKk5uPjowUHBzstr7/+unpe/syfeOIJp9d06tRJGzNmjHr86aefapUqVdLS0tLsz//888+a2WzWEhIS1HZkZKT20ksvXTMN8hkvv/yyfVveS/YtW7ZMbQ8aNEgbNWpUKX9zIu/AOmoiDyDzguvzXOsqV65sf9y5c2en52R7165d6rHkcKOjoxEcHGx/vmvXrrBYLIiJiVFF52fOnEHv3r2LTIPMDa2T9woNDVXzR4sxY8aoHP2OHTvQr18/DBkyBF26dLnJb03kHRioiTyABMaCRdGlReqUi8PPz89pWwK8BHsh9eOnTp3C0qVLsWLFChX0pSj9X//6V5mkmciTsI6ayAts3rz5qu1mzZqpx7KWumupq9Zt3LgRZrMZTZo0QUhICOrVq4dVq1bdVBqkIdnIkSPx9ddfY+bMmfj0009v6v2IvAVz1EQeICsrCwkJCU77fH197Q22pIFY+/bt0a1bN8ydOxdbtmzBv//9b/WcNPqaMmWKCqJTp07F+fPnMX78eDz88MOoXr26Okb2P/HEE4iIiFC549TUVBXM5bjimDx5Mtq1a6dajUtaf/rpJ/uNAhEVjYGayAMsX75cdZlyJLnhQ4cO2Vtkz5s3D08++aQ67ttvv0Xz5s3Vc9Kd6pdffsHTTz+NDh06qG2pT37nnXfs7yVBPDMzE++++y4mTpyobgDuu+++YqfP398fkyZNwsmTJ1VRevfu3VV6iOj6TNKirBjHEZGbkrrixYsXqwZcROR+WEdNRETkwhioiYiIXBjrqIk8HGu3iNwbc9REREQujIGaiIjIhTFQExERuTAGaiIiIhfGQE1EROTCGKiJiIhcGAM1ERGRC2OgJiIicmEM1ERERHBd/w85L3y7hZr3qQAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 500x300 with 2 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from previous_chapters import plot_values\n",
    "# Alternatively:\n",
    "# from llms_from_scratch.ch06 import plot_values\n",
    "\n",
    "epochs_tensor = torch.linspace(0, num_epochs, len(train_losses))\n",
    "examples_seen_tensor = torch.linspace(0, examples_seen, len(train_losses))\n",
    "\n",
    "plot_values(epochs_tensor, examples_seen_tensor, train_losses, val_losses, label=\"loss\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa074723-e3f7-4f7e-a267-855531a037dc",
   "metadata": {
    "id": "aa074723-e3f7-4f7e-a267-855531a037dc"
   },
   "source": [
    "- Note that we previously calculated the accuracy values on 5 batches only via the `eval_iter=5` setting; below, we calculate the accuracies on the full dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "1D2awlEq0gZi",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "1D2awlEq0gZi",
    "outputId": "d603eda1-d912-43eb-ec9c-af6a622510a0"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training accuracy: 99.81%\n",
      "Validation accuracy: 97.99%\n",
      "Test accuracy: 97.33%\n"
     ]
    }
   ],
   "source": [
    "train_accuracy = calc_accuracy_loader(train_loader, model, device)\n",
    "val_accuracy = calc_accuracy_loader(val_loader, model, device)\n",
    "test_accuracy = calc_accuracy_loader(test_loader, model, device)\n",
    "\n",
    "print(f\"Training accuracy: {train_accuracy*100:.2f}%\")\n",
    "print(f\"Validation accuracy: {val_accuracy*100:.2f}%\")\n",
    "print(f\"Test accuracy: {test_accuracy*100:.2f}%\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1f87f5e6-339e-4fcf-900b-6d845d3c713d",
   "metadata": {
    "id": "1f87f5e6-339e-4fcf-900b-6d845d3c713d"
   },
   "source": [
    "- As we can see based on the relatively high accuracy values above, the LoRA finetuning was successful"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "gpuType": "V100",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
